Now that you have a solid understanding of how a class module can be written and organized and how properties and methods work, it's time to learn something more about the intimate nature of objects in Visual Basic.
What exactly is an object variable? This could seem a rather silly question, but it isn't. The first answer that springs to mind is this: An object variable is a memory area that holds the object's data. This definition evidently derives from the resemblance of objects to UDT structures (which also hold aggregate data), but unfortunately it's completely wrong. The fact that these are two separate concepts becomes clear if you create two object variables that refer to the same object, as in:
    Dim p1 As New CPerson, p2 As CPerson
    p1.CompleteName = "John Smith"
    Set p2 = p1                       ' Both variables now point to the same object.
    Print p2.CompleteName             ' Displays "John Smith" as expected.
    ' Change the property using the first variable.
    p1.CompleteName = "Robert Smith"
    Print p2.CompleteName             ' 2nd variable gets the new value!
If objects and UDTs behaved in the same way, the last statement would have still returned the original value in p2 ("John Smith"), but it happened that the assignment to p1 in the second to last line also affected the other variable. The reason for this behavior is that an object variable is actually a pointer to the memory area where the object data is stored. This is an important concept that has a lot of interesting, and somewhat surprising, consequences, as you'll see in the list below.
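The contrast between UDTs and objects can be sketched side by side. This is only a minimal illustration: TPerson and CompareUdtAndObject are hypothetical names invented for the example, and CPerson is assumed to expose the read/write CompleteName property used above.

```vb
' In a standard (BAS) module. TPerson and CompareUdtAndObject are
' hypothetical names introduced only for this sketch.
Private Type TPerson
    CompleteName As String
End Type

Sub CompareUdtAndObject()
    ' UDT variables hold the data itself, so assignment copies the data.
    Dim u1 As TPerson, u2 As TPerson
    u1.CompleteName = "John Smith"
    u2 = u1
    u1.CompleteName = "Robert Smith"
    Debug.Print u2.CompleteName     ' u2 keeps its own copy: "John Smith"

    ' Object variables hold a pointer, so Set copies only the pointer.
    Dim p1 As CPerson, p2 As CPerson
    Set p1 = New CPerson
    p1.CompleteName = "John Smith"
    Set p2 = p1
    p1.CompleteName = "Robert Smith"
    Debug.Print p2.CompleteName     ' Same object: "Robert Smith"
End Sub
```

Note that the UDT assignment uses a plain equal sign, while the object assignment requires Set; the different syntax mirrors the different semantics.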
The last point implicitly raises a question: When is an object actually released? It turns out that Visual Basic destroys an object when no object variables are pointing to it:
    Sub TryMe()
        Dim p1 As CPerson, p2 As CPerson
        Set p1 = New CPerson            ' Creates object "A"
        p1.LastName = "Smith"
        Set p2 = p1                     ' Adds a 2nd reference to "A"
        Set p1 = New CPerson            ' Creates object "B", but doesn't
                                        ' release "A", pointed to by p2
        p1.LastName = p2.LastName       ' Copies a value, not an object ref
        Set p2 = Nothing                ' Destroys the original "A" object
    End Sub                             ' Destroys the second "B" object
As you see, keeping track of how many variables are pointing to a given object can easily become a daunting task. Fortunately, it's Visual Basic's problem, not yours. Visual Basic solves it using the so-called reference counter, which I'll talk about in the next section.
Figure 6-8 shows how a typical object is laid out in memory. The Visual Basic programmer sees just a few object variables: In this example, we have two variables, P1 and P2, which point to an instance of the CPerson class, and a third variable P3 that points to a distinct instance of the same class. Anytime Visual Basic creates a new instance of the class, it allocates a separate, well-defined area of memory (the instance data area). The structure and size of that area is fixed for any given class and depends on how many properties the class exposes, the types of properties, as well as other factors of no interest in this context. The structure of this area hasn't been documented by Microsoft, but fortunately you don't need to understand what data is stored there or how it's arranged.
Figure 6-8. The structure of objects is probably more complex than you had anticipated.
One piece of information is especially important, however, for all OO developers: the reference counter. It's a four-byte memory location that always holds the number of object variables pointing to that particular instance data block. In this example, the John Smith object has a reference counter equal to 2, while the Anne Brown object has a reference counter equal to 1. It's impossible for this counter to contain a value less than 1 because it would mean that no variable is pointing to that specific object, and the object would be immediately destroyed. Anyway, keep in mind that, as far as programmers are concerned, the reference counter is an abstract entity because it can't be read or modified in any way (using orthodox programming techniques, at least). The only changes that you can legitimately make to the reference counter are indirectly increasing and decreasing it using Set commands:
    Set p1 = New CPerson     ' Creates an object and sets
                             ' its reference counter to 1
    Set p2 = p1              ' Increments the reference counter to 2
    Set p1 = Nothing         ' Decrements the reference counter back to 1
    Set p2 = Nothing         ' Decrements the reference counter to 0
                             ' and destroys the object
                             ' (Or you can let p2 go out of scope....)
At the end of the instance data block are the values of all the class module's variables, including all module-level variables and Static variables in procedures (but excluding dynamic local variables, which are allocated on the stack during each call). Of course, these values vary from instance to instance, even though their layout is the same for all instances of the class.
Another undocumented piece of information in the instance data block is very important: It contains the VTable pointer. This 32-bit memory location can be found at the top of the instance data block and is a pointer to another key memory area named VTable. All objects that belong to the same class point to the same VTable; hence the first 4 bytes in their instance data blocks always match. Of course, the first 4 bytes for objects instantiated by different classes differ.
The VTable is what actually characterizes the behavior of a class, but in itself it's a remarkably small structure. In fact, it's just a sort of jump table, a series of Long pointers to the actual compiled code. Each pointer corresponds to a function, a sub, or a Property procedure and points to the first byte of the compiled code generated for each procedure during the compilation process. Read/write properties have two distinct entries in the VTable, and Variant properties might have up to three entries if you also provided a Property Set procedure. Because it's impossible to know at compile time where the application can find a free block of memory to load the compiled code into, the address of each compiled routine is known only at run time. For this reason, the VTable structure is dynamically created at run time as well.
The first time Visual Basic creates an object of a given class, its run-time module performs the following sequence of operations (here in a simplified form):
This long sequence has to be performed only the very first time your code creates an object of a given class. For all subsequent objects of the same class, steps 1 and 2 are skipped because the VTable is already in place. And when you're simply assigning object variables (that is, Set commands without a New clause), step 3 is also skipped and the whole operation becomes just an assignment of a 32-bit value.
Let's see now what happens when the client code invokes an object's method or property. Here we're examining only one of the many possible cases, which is when you're using an object from a class that resides in the same project. Since the compiler knows how the class is arranged, it also knows what the VTable of that class looks like. Of course, it isn't possible to know at compile time where the class's compiled code will be loaded, but at least its structure can be determined when the class is compiled. Therefore, the compiler can safely translate a reference to a property or a method into an offset in the VTable. Because the first seven items in the VTable are usually taken by other addresses (of no interest in this context), the first property procedure or method defined in the class has an offset equal to 28 (7 items * 4 bytes each). Let's say that in our class, this offset corresponds to the Property Get FirstName procedure. When the client code executes this statement:
    Print p1.FirstName
here's what more or less happens behind the scenes:
It's a long trip just to retrieve a value, but this is how things work in the marvelous world of objects. Knowing all this won't help you write better code, at least not immediately. But I'm sure that you will badly need this information some time in the future.
TIP
As a rule, allocating and releasing the instance data block for an object is a relatively slow operation. If your class module executes a lot of code in its Class_Initialize event—for example, it has to retrieve data from a database, the Registry, or an INI file—this overhead can become critical. For this reason, try to keep an instance alive by assigning it to a global object variable and release it only when you're sure that you won't need that object anymore. (Or let Visual Basic automatically set the variable to Nothing when the application comes to a natural termination.) You might also provide a special method—for example, Reset—that reinitializes all private variables without having to create a new instance.
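The Reset approach suggested in this tip might look like the following sketch. The class name CSettings and the private members m_Title and m_Items are hypothetical; the point is only that the expensive initialization runs once, while Reset restores a clean state on demand.

```vb
' Inside a class module, here assumed to be named CSettings.
' m_Title and m_Items are hypothetical private members.
Private m_Title As String
Private m_Items As Collection

Private Sub Class_Initialize()
    ' Expensive one-time work (database, Registry, INI file) would go here.
    Me.Reset
End Sub

' Bring the instance back to its initial state without re-creating it.
Public Sub Reset()
    m_Title = ""
    Set m_Items = New Collection
End Sub
```

A client could then keep a single instance in a global variable (for example, a hypothetical Public g_Settings As CSettings) and call g_Settings.Reset instead of setting the variable to Nothing and creating a new object.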
When no more object variables point to a given instance data block, the object is destroyed. Just before releasing the memory, the Visual Basic runtime invokes the Class_Terminate event procedure in the class module—if the programmer created one. This is the routine in which you place your clean-up code.
Visual Basic never goes further than that, and for example, it doesn't release the VTable either even if there isn't any other object pointing to it. This is an important detail because it ensures that the next time another object of this class is created, the overhead will be minimal. There are just a couple of other things that you should know about the termination phase:
In the previous section, I've emphasized that the application invokes methods and properties using offset values in the VTable and little else. This makes all object references really efficient because the CPU has to perform only some additions and other elementary steps. The process of obtaining the offset in the VTable from the name of a property or method is known as binding. As you've seen, this process is usually performed by the compiler, which then delivers efficient code ready to be executed at run time. Unfortunately, not all object references are so efficient. Let's see how we can embarrass the compiler:
    Dim obj As Object
    If n >= 0.5 Then
        Set obj = New CPerson
    Else
        Set obj = New CCustomer
    End If
    Print obj.CompleteName
As smart as it is, the Visual Basic compiler can't determine what the obj variable will actually contain at run time, and, in fact, its contents are entirely unpredictable. The problem is that even though the CPerson and CCustomer classes support the same CompleteName method, it hardly ever appears at the same offset in the VTable. So the compiler can't complete the binding process and can store in the executable code only the name of the method that must be invoked at run time. When execution finally hits that line, the Visual Basic runtime queries the obj variable, determines which object it contains, and finally calls its CompleteName method.
This is dramatically different from the situation we saw before, when we knew at compile time exactly which routine would be called. We have three different types of binding.
Early VTable Binding
The early binding process is completed at compile time. The compiler produces VTable offsets that are then efficiently used at run time to access the object's properties and methods. If the property or the method isn't supported, the compiler can trap the error right away, which means that early binding implicitly delivers more robust applications. Early binding is used whenever you use a variable of a well-defined type. You have an indirect confirmation that an object variable will use early binding when you append a dot to its name: The Visual Basic editor is able to give you a list of all the possible methods and properties. If the editor can do that, the compiler will later be able to complete the binding.
Late Binding
When you declare an object variable using an As Object or As Variant clause, the compiler can't deduce which type of object such a variable will contain and can therefore store only information about the property's or the method's name and arguments. The binding process is completed at run time and is performed any time the variable is referenced. As you might imagine, this takes a lot of time, and moreover there's no guarantee that the variable contains an object that supports the method you want. If the actual object doesn't support the method, a trappable run-time error will occur. If you have a generic As Object variable, appending a dot to its name in the code module doesn't invoke IntelliSense's drop-down list of properties and methods.
Early ID Binding
For the sake of completeness, I have to let you know about a third type of binding, whose behavior falls between that of the previous two. In the case of early ID binding, the compiler can't derive the actual offset in the VTable, but at least it can check that the property or method is there. If so, the compiler stores a special ID value in the executable code. At run time, Visual Basic uses this ID for a very quick look in the object's list of methods. This is slower than early VTable binding, but it's still much more efficient than late binding. It also ensures that no error occurs because we know with certainty that the method is supported. This type of binding is used for some external objects used by your application (for example, all ActiveX controls).
The easy rule is, therefore, that you should always strive to use early binding in your code. Apart from robustness considerations, late binding adds a performance penalty that in most cases you simply can't afford. Just to give you a broad idea, accessing a simple property using late binding is about two hundred times slower than with the most efficient early binding! When the called code is more complex, this gap tends to be reduced because early binding affects only the call time, not the execution of the code inside the method. Even so, you can hardly consider the difference in performance negligible.
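If you want to measure the gap on your own machine, a rough timing loop such as the one below can help. It's only a sketch: CompareBinding and the REPS constant are hypothetical, CPerson is assumed to expose the CompleteName property used earlier, and the Timer function is too coarse for anything more than a ballpark figure.

```vb
' CompareBinding is a hypothetical test routine for a standard module.
Sub CompareBinding()
    Const REPS As Long = 100000
    Dim early As CPerson, late As Object
    Dim i As Long, t As Single, s As String

    Set early = New CPerson
    early.CompleteName = "John Smith"
    Set late = early                 ' Same object, seen through As Object

    t = Timer
    For i = 1 To REPS
        s = early.CompleteName       ' Bound at compile time
    Next
    Debug.Print "Early binding: "; Timer - t; " sec."

    t = Timer
    For i = 1 To REPS
        s = late.CompleteName        ' Name resolved at every call
    Next
    Debug.Print "Late binding:  "; Timer - t; " sec."
End Sub
```

Both loops act on the same instance, so any difference in the timings comes from how the call is bound rather than from the code inside the property procedure. The actual ratio depends on the machine and on whether the code runs compiled or inside the environment.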
Finally, note that the way you declare an object variable affects whether Visual Basic uses early binding or late binding, but you have no control over which type of early binding Visual Basic uses. You can be sure, however, that it always uses the most convenient one. If the object is defined inside the current application, or its library exports the information about how its VTable is structured, Visual Basic uses the more efficient VTable binding; otherwise, it uses early ID binding.
Armed with all the intimate knowledge about objects that I've now given you, you should find it very simple to understand the real mechanism behind a few VBA keywords.
The New keyword (when used in a Set command) tells Visual Basic to create a brand-new instance of a given class. The keyword then returns the address of the instance data area just allocated.
The Set command simply copies what it finds to the right of the equal sign into the object variable that appears to the left of it. This value can be, for example, the result of a New keyword, the contents of another variable that already exists, or the result of an expression that evaluates to an object. The only other tasks that the Set command performs are incrementing the reference counter of the corresponding instance data area and decrementing the reference counter of the object originally pointed to by the left-hand variable (if the variable didn't contain the Nothing value):
    Set P1 = New CPerson    ' Creates an object, stores its address
    Set P2 = P1             ' Just copies addresses
    Set P2 = New CPerson    ' Lets P2 point to a new object, but also
                            ' decrements the reference counter
                            ' of the original object
The Nothing keyword is the Visual Basic way of saying Null or 0 to an object variable. The statement
    Set P1 = Nothing
isn't a special case in the Set scenario because it simply decreases the reference counter of the instance data block pointed to by P1 and then stores 0 in the P1 variable itself, thus disconnecting it from the object instance. If P1 was the only variable currently pointing to that instance, Visual Basic also releases the instance.
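You can watch this mechanism at work by letting a class announce its own destruction. The sketch below assumes a hypothetical class module named CProbe whose only code is a Class_Terminate event procedure; WatchRelease is an equally hypothetical test routine.

```vb
' --- Class module CProbe (hypothetical) ---
Private Sub Class_Terminate()
    Debug.Print "CProbe instance released"
End Sub

' --- Standard module ---
Sub WatchRelease()
    Dim a As CProbe, b As CProbe
    Set a = New CProbe
    Set b = a               ' Two references to one instance
    Set a = Nothing         ' One reference left: the object survives
    Debug.Print "a cleared"
    Set b = Nothing         ' Last reference gone: Class_Terminate runs
    Debug.Print "b cleared"
End Sub
```

The Immediate window shows "a cleared" first, then the message from Class_Terminate, and finally "b cleared", which confirms that the instance is released only when the last variable pointing to it is set to Nothing.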
The Is operator is used by Visual Basic to check whether two object variables are pointing to the same instance data block. At a lower level, Visual Basic does nothing but compare the actual addresses contained in the two operands and return True if they match. The only possible variant is when you use the Is Nothing test, in which case Visual Basic compares the contents of a variable with the value 0. You need this special operator because the standard equal symbol, which has a completely different meaning, would fire the evaluation of the objects' default properties:
    ' This code assumes that P1 and P2 are CPerson variables, and that
    ' Name is the default property of the CPerson class.
    If P1 Is P2 Then Print "P1 and P2 point to the same CPerson object"
    If P1 = P2 Then Print "P1's Name and P2's Name are the same"
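The Is Nothing test is typically used to create an object on demand. The following sketch (IsNothingDemo is a hypothetical routine) also shows a side effect worth remembering: a variable declared with As New is re-created whenever it's referenced while holding Nothing, so the test never succeeds on it.

```vb
' IsNothingDemo is a hypothetical routine for a standard module.
Sub IsNothingDemo()
    Dim p1 As CPerson            ' No instance yet
    Debug.Print p1 Is Nothing    ' True

    If p1 Is Nothing Then Set p1 = New CPerson   ' Create on demand
    Debug.Print p1 Is Nothing    ' False

    ' An auto-instancing (As New) variable creates the object as soon
    ' as it's referenced, so this test can't detect a released object.
    Dim p2 As New CPerson
    Set p2 = Nothing
    Debug.Print p2 Is Nothing    ' False: the reference re-creates the object
End Sub
```

For this reason, when the code needs to know whether an object has been created, it's safer to declare the variable without New and use an explicit Set statement.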
You can test the type of an object variable using the TypeOf...Is statement:
    If TypeOf P1 Is CPerson Then
        Print "P1 is of type CPerson"
    ElseIf TypeOf P1 Is CEmployee Then
        Print "P1 is of type CEmployee"
    End If
You should be aware of a couple of limitations. First, you can test only one class at a time, and you can't even directly test to see whether an object is not of a particular class. In this case, you need a workaround:
    If TypeOf dict Is Scripting.Dictionary Then
        ' Do nothing in this case.
    Else
        Print "DICT is NOT of a Dictionary object"
    End If
Second, the preceding code works only if the Scripting library (or, more generally, the referenced library) is currently included in the References dialog box. If it isn't, Visual Basic will refuse to compile this code. This is sometimes a nuisance when you want to write reusable routines.
TIP
You often use the TypeOf...Is statement to avoid errors when assigning object variables, as in this code:
    ' OBJ holds a reference to a control.
    Dim lst As ListBox, cbo As ComboBox
    If TypeOf obj Is ListBox Then
        Set lst = obj
    ElseIf TypeOf obj Is ComboBox Then
        Set cbo = obj
    End If
But here's a faster and more concise way:
    Dim lst As ListBox, cbo As ComboBox
    On Error Resume Next
    Set lst = obj        ' The assignment that fails will leave
    Set cbo = obj        ' the corresponding variable set to Nothing.
    On Error GoTo 0      ' Cancel error trapping.
The TypeName function returns the name of an object's class in the form of a string. This means that you can find the type of an object in a more concise form, as follows:
    Print "P1 is of type " & TypeName(P1)
In many situations, testing an object's type using the TypeName function is preferable to using the TypeOf...Is statement because it doesn't require that the object class be present in the current application or in the References dialog box.
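Because TypeName returns a plain string, it also lets you test several classes in one Select Case block. The helper below is a hypothetical sketch (DescribeObject isn't part of any library) that compiles even if the Scripting library isn't referenced.

```vb
' DescribeObject is a hypothetical helper for a standard module.
Function DescribeObject(obj As Object) As String
    Select Case TypeName(obj)
        Case "Nothing"
            DescribeObject = "no object assigned"
        Case "CPerson", "CEmployee"
            DescribeObject = "a person-like object"
        Case "Dictionary"
            DescribeObject = "a Scripting.Dictionary"
        Case Else
            DescribeObject = "something else: " & TypeName(obj)
    End Select
End Function
```

For example, Debug.Print DescribeObject(CreateObject("Scripting.Dictionary")) takes the "Dictionary" branch. Keep in mind two limits of this technique: the string comparison is case sensitive under the default Option Compare Binary setting, and TypeName returns only the class name, so it can't tell apart two classes with the same name that come from different libraries.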
The fact that object variables are just pointers can puzzle many a programmer when object variables are passed to a procedure as ByVal arguments. The familiar rule (a procedure can alter a ByVal value without affecting the original value seen by the caller) is obviously void when the value is just a pointer. In this case, you're simply creating a copy of the pointer, not of the instance data area. Both the original and the new object reference are pointing to the same area, so the called procedure can freely read and modify all the properties of the object. If you want to prevent any modifications of the original object, you must pass the procedure a copy of the object. To do so, you must create a new object yourself, duplicate all the properties' values, and pass that new object instead. Visual Basic doesn't offer a shortcut for this.
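A common way to package this manual copy is a method that returns a duplicate of the object. The sketch below is hypothetical: Clone isn't a built-in feature, and FirstName and LastName are assumed to be read/write properties of CPerson.

```vb
' Inside the CPerson class module. Clone is a hypothetical method, and
' FirstName and LastName are assumed to be read/write properties.
Public Function Clone() As CPerson
    Dim copy As CPerson
    Set copy = New CPerson
    copy.FirstName = Me.FirstName
    copy.LastName = Me.LastName
    Set Clone = copy
End Function
```

The caller can then pass pers.Clone instead of pers to any procedure that shouldn't touch the original. This is a shallow copy: if a property holds a reference to another object, the original and the clone end up sharing that child object unless you duplicate it as well.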
That said, you need to understand that there's a subtle difference when you declare an object parameter using ByRef or ByVal, as this code snippet demonstrates:
    Sub Reset(pers As CPerson)          ' ByRef can be omitted.
        Set pers = Nothing              ' This actually sets the original
    End Sub                             ' variable to Nothing.

    Sub Reset2(ByVal pers As CPerson)
        Set pers = Nothing              ' This code doesn't do anything.
    End Sub
When you pass an object using ByVal, its internal reference counter is temporarily incremented and is decremented when the procedure exits. This doesn't happen if you pass the object by reference. For this reason, the ByRef keyword is slightly faster when used with objects.
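The difference is easy to verify from the caller's side. The following sketch assumes that the Reset and Reset2 procedures shown above live in the same module as the hypothetical TestResetVariants routine.

```vb
' TestResetVariants is a hypothetical test routine; it assumes that
' Reset and Reset2 are defined in the same module.
Sub TestResetVariants()
    Dim p As CPerson
    Set p = New CPerson

    Reset2 p                     ' ByVal: only the procedure's copy is cleared
    Debug.Print p Is Nothing     ' False

    Reset p                      ' ByRef: the caller's variable is cleared
    Debug.Print p Is Nothing     ' True
End Sub
```

The variable is declared without New on purpose; an auto-instancing variable would re-create the object inside the Is Nothing test and hide the effect.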
Visual Basic fires the Class_Terminate event one instant before releasing the instance data block and terminating the object's life. You usually write code for this event when you need to undo things that you did at initialization time or during the life of the instance. Typically in this event procedure, you close any open files and release Windows resources obtained through direct API calls. If you want to make the object's properties persist in a database for a future session, this is where you usually do it. All in all, however, you'll rarely write code for this event or at least you'll need it less frequently than code for the Class_Initialize event. For example, the CPerson class module doesn't actually require code in its Class_Terminate event procedure.
On the other hand, the mere fact that you can write some executable code and be sure that it will be executed when an object is destroyed opens up a world of possibilities that couldn't be exploited using any other, non-OOP technique. To show you what I mean, I've prepared three sample classes that are almost completely based on this simple concept. It's a great occasion to show how you can streamline several common programmer tasks using the power that objects give you.
CAUTION
Visual Basic calls the Class_Terminate event procedure only when the object is released in an orderly manner—that is, when all references pointing to it are set to Nothing or go out of scope, or when the application comes to an end. This includes the case when the application ends because of a fatal error. The only case when Visual Basic does not invoke the Class_Terminate event is when you abruptly stop a program using the End command from the Run menu or the End button on the toolbar. This immediately stops all activity in your code, which means that no Class_Terminate event will ever be invoked. If you inserted critical code in the Terminate events—for example, code that releases Windows resources allocated via APIs—you'll experience problems. Sometimes these are big problems, including system crashes. By the same token, never terminate a program using an End statement in code: This has exactly the same effect, but it's going to create problems even after you compile the application and run it outside the environment.
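One orderly alternative to the End statement is to unload all the forms and let the application come to its natural termination. The routine below is a minimal sketch under that assumption; ShutDown is a hypothetical name, and the approach presumes that nothing else (a running loop or a timer, for example) keeps the program alive.

```vb
' In a standard module. ShutDown is a hypothetical helper.
Sub ShutDown()
    Dim frm As Form
    ' Unloading every loaded form lets the application end naturally,
    ' so the pending Class_Terminate events still fire.
    For Each frm In Forms
        Unload frm
    Next
End Sub
```

You might call such a routine from an Exit menu command in place of End.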
Programmers commonly change the shape of the mouse cursor, typically to an hourglass, to inform the user that some lengthy operation is going on. Of course, you also have to restore the cursor before exiting the current procedure; otherwise, the hourglass stays visible and the user never realizes that the wait is over. As simple as this task is, I've found that a good number of commercial applications fail to restore the original shape under certain circumstances. This is a clear symptom that the procedure has exited unexpectedly and therefore missed its opportunity to restore the original shape. How can classes and objects help you avoid the same error? Just have a look at this simple CMouse class module:
    ' The CMouse class--complete source code
    Dim m_OldPointer As Variant

    ' Enforce a new mouse pointer.
    Sub SetPointer(Optional NewPointer As MousePointerConstants = vbHourglass)
        ' Store the original pointer only once.
        If IsEmpty(m_OldPointer) Then m_OldPointer = Screen.MousePointer
        Screen.MousePointer = NewPointer
    End Sub

    ' Restore the original pointer when the object goes out of scope.
    Private Sub Class_Terminate()
        ' Only if SetPointer had been actually called
        If Not IsEmpty(m_OldPointer) Then Screen.MousePointer = m_OldPointer
    End Sub
Not bad, eh? Just eight lines of code (not counting comments) to solve a recurring bug once and for all! See how easy it is to use the class in a real program:
    Sub ALengthyProcedure()
        Dim m As New CMouse
        m.SetPointer vbHourglass    ' Or any other pointer shape
        ' ... slow code here ... (omitted)
    End Sub
The trick works because as soon as the variable goes out of scope, the object is destroyed and Visual Basic fires its Class_Terminate event. The interesting point is that this sequence also occurs if the procedure is exited because of an error; even in that case, Visual Basic releases all the variables that are local to the procedure in an orderly fashion.
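To convince yourself that the pointer is restored after an error too, you can simulate a failure. This is just a test sketch: FailingProcedure and CallFailingProcedure are hypothetical names, and error 5 is raised on purpose.

```vb
' Hypothetical test routines for a standard module.
Sub FailingProcedure()
    Dim m As New CMouse
    m.SetPointer vbHourglass
    Err.Raise 5                  ' Simulate an unexpected error
End Sub

Sub CallFailingProcedure()
    On Error Resume Next
    FailingProcedure
    ' The local CMouse object was released when the error forced
    ' FailingProcedure to exit, so the pointer should be back to the
    ' shape it had before the call.
    Debug.Print "Pointer after the error: "; Screen.MousePointer
End Sub
```

FailingProcedure contains no error handler and no clean-up code, yet the value printed should match the pointer shape that was active before the call.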
Another common programming task is opening a file to process it and then closing it before exiting the procedure. As we've seen in Chapter 5, all the procedures that deal with files have to protect themselves against unanticipated errors because if they were exited in an abrupt way they wouldn't correctly close the file. Once again, let's see how a class can help us to deliver more robust code with less effort:
    ' The CFile class--complete source code
    Enum OpenModeConstants
        omInput
        omOutput
        omAppend
        omRandom
        omBinary
    End Enum
    Dim m_Filename As String, m_Handle As Integer

    Sub OpenFile(Filename As String, _
        Optional mode As OpenModeConstants = omRandom)
        Dim h As Integer
        ' Get the next available file handle.
        h = FreeFile()
        ' Open the file with desired access mode.
        Select Case mode
            Case omInput: Open Filename For Input As #h
            Case omOutput: Open Filename For Output As #h
            Case omAppend: Open Filename For Append As #h
            Case omBinary: Open Filename For Binary As #h
            Case Else          ' This is the default case.
                Open Filename For Random As #h
        End Select
        ' (Never reaches this point if an error has occurred.)
        m_Handle = h
        m_Filename = Filename
    End Sub

    ' The filename (read-only property)
    Property Get Filename() As String
        Filename = m_Filename
    End Property

    ' The file handle (read-only property)
    Property Get Handle() As Integer
        Handle = m_Handle
    End Property

    ' Close the file, if still open.
    Sub CloseFile()
        If m_Handle Then
            Close #m_Handle
            m_Handle = 0
        End If
    End Sub

    Private Sub Class_Terminate()
        ' Force a CloseFile operation when the object goes out of scope.
        CloseFile
    End Sub
This class solves most of the problems that are usually related to file processing, including finding the next available file handle and closing the file before exiting the procedure:
    ' This routine assumes that the file always exists and can be opened.
    ' If it's not the case, it raises an error in the client code.
    Sub LoadFileIntoTextBox(txt As TextBox, filename As String)
        Dim f As New CFile
        f.OpenFile filename, omInput
        txt.Text = Input$(LOF(f.Handle), f.Handle)
        ' No need to close it before exiting the procedure!
    End Sub
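The same class works just as well for output. The routine below is a hypothetical counterpart of LoadFileIntoTextBox, sketched under the assumption that overwriting an existing file is acceptable.

```vb
' SaveTextBoxToFile is a hypothetical counterpart of LoadFileIntoTextBox.
' Opening a file For Output overwrites any existing contents.
Sub SaveTextBoxToFile(txt As TextBox, filename As String)
    Dim f As New CFile, h As Integer
    f.OpenFile filename, omOutput
    h = f.Handle
    Print #h, txt.Text;      ' Trailing semicolon: no extra line break
    ' The file is closed when f goes out of scope, even after an error.
End Sub
```

The handle is copied to a local variable only to keep the Print # statement easy to read.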
I'll conclude this chapter with a simple class that you'll probably find useful when debugging dozens of nested procedures that call one another over and over. In such cases, nothing can preserve your sanity more effectively than a log of the actual sequence of calls. Unfortunately, this is easier said than done because while it is trivial to add a Debug.Print command as the first executable statement of every procedure, trapping the instant when the procedure is exited is a complex matter—especially if the procedure has multiple exit points or isn't protected by an error handler. However, this thorny problem can be solved with a class that counts exactly eight lines of executable code:
    ' Class CTracer - complete source code.
    Private m_procname As String, m_enterTime As Single

    Sub Enter(procname As String)
        m_procname = procname: m_enterTime = Timer
        ' Print the log when the procedure is entered.
        Debug.Print "Enter " & m_procname
    End Sub

    Private Sub Class_Terminate()
        ' Print the log when the procedure is exited.
        Debug.Print "Exit " & m_procname & " - sec. " & (Timer - m_enterTime)
    End Sub
Using the class is straightforward because you have to add only two statements on top of any procedure that you want to trace:
    Sub AnyProcedure()
        Dim t As New CTracer
        t.Enter "AnyProcedure"
        ' ... Here is the code that does the real thing ...(omitted).
    End Sub
The CTracer class displays the total time spent within the procedure, so it also works as a simple profiler. It was so easy to add this feature that I couldn't resist the temptation.
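The log becomes more interesting with nested calls. In this sketch, OuterProc and InnerProc are hypothetical procedures that do nothing except trace themselves.

```vb
' OuterProc and InnerProc are hypothetical procedures used only
' to show the order of the log entries.
Sub OuterProc()
    Dim t As New CTracer
    t.Enter "OuterProc"
    InnerProc
End Sub

Sub InnerProc()
    Dim t As New CTracer
    t.Enter "InnerProc"
End Sub
```

Running OuterProc logs the two Enter lines first, then the Exit line for InnerProc, and finally the Exit line for OuterProc, because each tracer object is released only when its own procedure ends. Since the elapsed time relies on the Timer function, which restarts from zero at midnight, a procedure that is running across midnight reports a meaningless value.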
This chapter introduced you to object-oriented programming in Visual Basic, but there are other things to know about classes and objects, such as events, polymorphism, and inheritance. I describe all these topics in the next chapter, along with several tips for building more robust Visual Basic applications.